About Control Tags

A text-to-speech engine can usually translate individual words to speech successfully. However, as soon as the engine speaks a whole sentence, the perceived quality of its translation decreases because the engine cannot correctly synthesize human prosody: the inflection, accent, and timing of human speech. You can adjust how the text is spoken by inserting commands, called control tags, in the source text.

Note: Before using control tags, review the Syntax Rules and Conventions.

Text-to-speech control tags improve the prosody of translated speech so that it better simulates human speech. The following control tags can be embedded in the source text:

Chr - Character
Com - Comment
Ctx - Context (some speech engines may not support this tag)
Dem - De-emphasize (new for SAPI 4.0; some speech engines may not support this tag)
Emp - Emphasize (some speech engines may not support this tag)
Eng - Engine Specific Command
Pau - Pause Speech
Pit - Baseline Pitch
Pra - Pitch Range (new for SAPI 4.0; some speech engines may not support this tag)
Prn - Pronunciation (some speech engines may not support this tag)
Pro - Prosodic Rule (some speech engines may not support this tag)
Prt - Part of Speech (some speech engines may not support this tag)
RmS - Reading Mode Spelling (new for SAPI 4.0; some speech engines may not support this tag)
RmW - Reading Mode Audible Pauses (new for SAPI 4.0; some speech engines may not support this tag)
RPit - Relative Pitch (new for SAPI 4.0; some speech engines may not support this tag)
RPrn - Relative Pitch Range (new for SAPI 4.0; some speech engines may not support this tag)
RSpd - Relative Speed (new for SAPI 4.0; some speech engines may not support this tag)
Rst - Reset Engine (new for SAPI 4.0)
Spd - Talking Speed
Vce - Speaking Voice (some speech engines may not support this tag)
Vol - Speaking Volume
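
As a rough illustration of how tags from this list might be embedded in source text, the following Python sketch builds and inspects tagged text. It assumes a backslash-delimited tag form (a backslash, the tag name, an optional "=value", and a closing backslash); confirm the exact form against the Syntax Rules and Conventions before relying on it. The helper names (make_tag, find_tags, strip_tags) and the numeric values are hypothetical and are used only for illustration.

```python
import re

# Tag names taken from the list in this topic. Comparing them
# case-insensitively is an assumption of this sketch.
KNOWN_TAGS = {
    "chr", "com", "ctx", "dem", "emp", "eng", "pau", "pit", "pra", "prn",
    "pro", "prt", "rms", "rmw", "rpit", "rprn", "rspd", "rst", "spd",
    "vce", "vol",
}

# Assumed tag shape: backslash, name, optional "=value", backslash.
TAG_PATTERN = re.compile(r"\\([A-Za-z]+)(?:=([^\\]*))?\\")


def make_tag(name, value=None):
    """Build one control tag in the assumed backslash-delimited form."""
    if name.lower() not in KNOWN_TAGS:
        raise ValueError(f"unknown control tag: {name}")
    if value is None:
        return f"\\{name}\\"
    return f"\\{name}={value}\\"


def find_tags(text):
    """Return a (name, value) pair for every tag found in the text."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(text)]


def strip_tags(text):
    """Remove all tags, leaving only the words to be spoken."""
    return TAG_PATTERN.sub("", text)


# Placeholder values; the meaning and range of each value depend on the tag.
source = (
    make_tag("Spd", 150) + " This sentence is read at a set speed. "
    + make_tag("Pau", 500) + " This one follows a pause. "
    + make_tag("Rst") + " Settings are back to the defaults."
)

print(source)
print(find_tags(source))
print(strip_tags(source))
```

Because several tags are optional for engines, a scan such as find_tags can be used to see which tags a piece of text relies on before it is sent to a particular engine.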